Research Statement – Shafiq Joty
Abstract
The Internet is a great source of human knowledge, but most of that information takes the form of unstructured text. In Natural Language Processing (NLP), we add structure to this text to uncover relevant information and use it to build end-user applications. To this end, my primary research goal is twofold: (i) developing NLP tools that automatically analyze language phenomena extending beyond the individual clauses or sentences of a text, i.e., its discourse structure; and (ii) exploiting these discourse analysis tools effectively in downstream NLP applications, including machine translation, summarization, question answering, and sentiment analysis. One methodology emphasized throughout my research is to first identify the inherent semantic structures in a given problem and then develop structured machine learning models that exploit those structures effectively. My work has relied on deep learning for better representation of the input text and on probabilistic graphical models for capturing dependencies in the output.
Related resources
Statement of research interests
With the ever-increasing popularity of web technologies, it is now very common for people to discuss events, tasks, and personal experiences in social media (e.g., Facebook, Twitter, blogs, fora) and email. These are examples of asynchronous conversations, where participants communicate with each other at different times. The huge amount of textual data generated every day in these conversati...
Correction of: Sleep Quality Prediction From Wearable Data Using Deep Learning
[This corrects the article DOI: 10.2196/mhealth.6562.]
Improving graph-based random walks for complex question answering using syntactic, shallow semantic and extended string subsequence kernels
Article history: Received 29 March 2009; received in revised form 27 September 2010; accepted 3 October 2010; available online xxxx.
The University of British Columbia at TAC 2008
In this paper, we describe the University of British Columbia’s participation in the Text Analysis Conference 2008. This work represents our first submission to the DUC/TAC series of conferences, and we participated in both summarization tasks: the main update task and the pilot task on summarizing blog opinions. We describe our systems in detail and describe our performance in the co...
Towards Topic Labeling with Phrase Entailment and Aggregation
We propose a novel framework for topic labeling that assigns the most representative phrases for a given set of sentences covering the same topic. We build an entailment graph over phrases that are extracted from the sentences, and use the entailment relations to identify and select the most relevant phrases. We then aggregate those selected phrases by means of phrase generalization and merging...